SMDM Europe Worked Examples

Questions and To-Do

  • [Shawn] In our appendix example, we have a two-cycle tunnel state (T1 and T2) once you become ill. We allow for exits to background mortality from T1, but everyone in T2 exits to the receiving bucket “N”. I assume we don’t need exits to background mortality because we’re only using these for accounting purposes – but if it’s for a transient utility decrement, don’t we only need to apply the utility decrement to the non-decedents within T2?

  • [Shawn] Interestingly, in the HIV model, when I add in background mortality for Germany (based on modeled life table data), the resulting transition probability matrix (for an initial cohort of 40 year olds) results in a negative probability for transitions to secular death. I had to add in a parameter which is a hazard ratio for the HIV population (essentially multiplying the background mortality rate by 2) to get positive probabilities. But could use this as an example to show that embedding from the solved generator matrix doesn’t guarantee non-zero probabilities?

  • [Shawn] Need the optimization algorithm to solve for PSA parameters that minimizes:

(F(x_1) - p_1)^2 + (F(x_2)-p_2)^2

  • [Shawn]. Can we also add analytic solution code for lognormal and logistic? Or just rely on the optimization?

  • [All] Do we need a full-fledged CEA for the CVD model? Or would just creating the natural disease model suffice?

  • [Astrid] The life table data at the global burden of disease website is split into different files for different age buckets. Can we download these files and aggretate them together to get a full life table for each country/region/year combo?

  • [Astrid] Perhaps we could create a Shiny app that does the above?

    • Input: country/region and year
    • Output 1: Constructed life table
    • Output 2: Parameters for mortality model

0. Setup

We begin by specifying a parameter object that will ultimately guide the discrete time model.

params <- list(
  n_age_init = 55,
  n_age_max = 65,
  d_c = 0.03,
  d_e = 0.03, 
  n_sim = 1000,
  r_HS = 0.1,
  hr_SD = 1.9, # hazard ratio on rate
  # overall death
  # p_HD is the time-varying background mortality
  # p_SD is the time-varying background mortality * 1.9(Hazard Ratio)
  hr_HS_trt = 0.5,
  u_H = 1,
  u_S = 0.773,
  u_D = 0,
  c_H = 8494, # Non-CVD health care cost
  c_trt = 840,
  c_S = 3917 + 8494, # Cost of annual follow-up post ASCVD
  c_nonfatal_firstYr = 49348,
  c_fatal_firstYr = 16760,
  c_D = 0,
  cycle_length = 1,
  n_cycles = 10
)

v_names_states = c("Healthy", "Sick", "Death")
n_states = length(v_names_states)
v_names_str = c("quo","trt")
n_strategies = length(v_names_str)

1. Mortality Modeling (Alive-Dead)

Estimate a mortality model to parameterize background mortality.

Background mortality will be based on a mortality model. This reduces the parameterization from the total number of age bins in the life table data to only a few parameters that can be used to characterize background mortality.

First we download and store the vital statistics data for the US from the human mortality database:

Next we construct a life table from these data. This is done using the demography package.

#############################################
# Define Parameters for Life Table Modeling
#############################################
mort_year =  # Year to obtain from Human Mortality Database
  2019
N = # Population basis for life table
  100000
max_age =  # Max age in life table
  99
min_age =  # Min age in life table
  0
n_cycles =  # Number of discrete time (annual) cycles in Markov
  100

###############################
# Underlying Life-Table Data
###############################

lt <-  # Read in the U.S. life table data from the human mortality database.
  read_rds(here("_sandbox/mortality/usa-life-table.rds")) %>% 
  lifetable(.,series = "total", years = mort_year) %>% 
  as_tibble() %>% 
  mutate_at(vars(lx,dx), function(x) x*N) %>% 
  mutate(country = "USA") %>% 
  mutate(age = x)

lt %>% 
  ungroup() %>% 
  select(-x) %>% 
  select(country,age,everything()) %>% 
  head() %>% 
  kable() %>% 
  kable_styling()
country age year mx qx lx dx Lx Tx ex
USA 0 2019 0.00556 0.00553 100000 553.221 0.99482 78.963 78.963
USA 1 2019 0.00038 0.00038 99447 38.180 0.99428 77.968 78.402
USA 2 2019 0.00023 0.00023 99409 23.160 0.99397 76.974 77.432
USA 3 2019 0.00018 0.00018 99385 17.689 0.99377 75.980 76.450
USA 4 2019 0.00014 0.00014 99368 14.209 0.99361 74.986 75.463
USA 5 2019 0.00013 0.00013 99354 13.213 0.99347 73.993 74.474

The columns here are

  • age: Ages for lifetable
  • year: Period years or cohort years
  • mx: Death rate at age x.
  • qx: The probability that an individual of exact age x will die before exact age x+1.
  • lx: Number of survivors to exact age x. This is defined relative to a radix, or the size of a cohort from which the life table is derived).
  • dx: The number of deaths between exact ages x and x+1.
  • Lx: Number of years lived between exact age x and exact age x+1.
  • Tx: Number of years lived after exact age x.
  • ex: Remaining life expectancy at exact age x.

Our next step is to fit a mortality model to these data. Generally speaking, we need three inputs:

  • age: Ages for lifetable
  • dx: The number of deaths between exact ages x and x+1.
  • lx: Number of survivors to exact age x. This is defined relative to a radix, or the size of a cohort from which the life table is derived).

The MortalityLaws package has a number of mortality models we can draw from. Table 1 summarizes these.

laws <- availableLaws()
laws$table %>% 
  arrange(TYPE) %>% 
  kable() %>% 
  kable_styling()
Table 1: Mortality Models in MortalityLaws package
YEAR NAME MODEL TYPE CODE FIT SCALE_X
1870 Opperman mu[x] = A/sqrt(x) - B + C*sqrt(x) 1 opperman mu[x] FALSE
1939 Weibull mu[x] = 1/sigma * (x/M)^(M/sigma - 1) 1 weibull mu[x] FALSE
NA Inverse-Gompertz mu[x] = [1- exp(-(x-M)/sigma)] / [exp(-(x-M)/sigma) - 1] 2 invgompertz mu[x] TRUE
NA Inverse-Weibull mu[x] = 1/sigma * (x/M)^[-M/sigma - 1] / [exp((x/M)^(-M/sigma)) - 1] 2 invweibull mu[x] TRUE
1825 Gompertz mu[x] = A exp[Bx] 3 gompertz mu[x] TRUE
NA Gompertz mu[x] = 1/sigma * exp[(x-M)/sigma)] 3 gompertz0 mu[x] TRUE
1860 Makeham mu[x] = A exp[Bx] + C 3 makeham mu[x] TRUE
NA Makeham mu[x] = 1/sigma * exp[(x-M)/sigma)] + C 3 makeham0 mu[x] TRUE
1932 Perks mu[x] = [A + BC^x] / [BC^-x + 1 + DC^x] 3 perks mu[x] TRUE
1960 Strehler-Mildvan mu[x] = K * exp[-V0 * (1 - Bx)/D] 3 strehler_mildvan mu[x] TRUE
1943 Van der Maen mu[x] = A + Bx + Cx^2 + I/[N - x] 4 vandermaen mu[x] TRUE
1971 Beard mu[x] = A exp(Bx) / [1 + KA exp(Bx)] 4 beard mu[x] TRUE
1971 Beard-Makeham mu[x] = A exp(Bx) / [1 + KA exp(Bx)] + C 4 beard_makeham mu[x] TRUE
1979 Gamma-Gompertz mu[x] = A exp(Bx) / (1 + AG/B * [exp(Bx) - 1]) 4 ggompertz mu[x] TRUE
1943 Van der Maen mu[x] = A + Bx + I/[N - x] 5 vandermaen2 mu[x] TRUE
NA Quadratic mu[x] = A + Bx + Cx^2 5 quadratic mu[x] TRUE
1998 Kannisto mu[x] = A exp(Bx) / [1 + A exp(Bx)] 5 kannisto mu[x] TRUE
1998 Kannisto-Makeham mu[x] = A exp(Bx) / [1 + A exp(Bx)] + C 5 kannisto_makeham mu[x] TRUE
1871 Thiele mu[x] = A exp(-Bx) + C exp[-.5D (x-E)^2] + F exp(Gx) 6 thiele mu[x] FALSE
1883 Wittstein q[x] = (1/B) A^-[(Bx)^N] + A^-[(M-x)^N] 6 wittstein q[x] FALSE
1979 Siler mu[x] = A exp(-Bx) + C + D exp(Ex) 6 siler mu[x] FALSE
1980 Heligman-Pollard q[x]/p[x] = A^[(x + B)^C] + D exp[-E log(x/F)^2] + G H^x 6 HP q[x] FALSE
1980 Heligman-Pollard q[x] = A^[(x + B)^C] + D exp[-E log(x/F)^2] + GH^x / [1 + GH^x] 6 HP2 q[x] FALSE
1980 Heligman-Pollard q[x] = A^[(x + B)^C] + D exp[-E log(x/F)^2] + GH^x / [1 + KGH^x] 6 HP3 q[x] FALSE
1980 Heligman-Pollard q[x] = A^[(x + B)^C] + D exp[-E log(x/F)^2] + GH^(x^K) / [1 + GH^(x^K)] 6 HP4 q[x] FALSE
1983 Rogers-Planck q[x] = A0 + A1 exp[-Ax] + A2 exp[B(x - u) - exp(-C(x - u))] + A3 exp[Dx] 6 rogersplanck q[x] FALSE
1987 Martinelle mu[x] = [A exp(Bx) + C] / [1 + D exp(Bx)] + K exp(Bx) 6 martinelle mu[x] FALSE
1992 Carriere l[x] = P1 l[x](weibull) + P2 l[x](invweibull) + P3 l[x](gompertz) 6 carriere1 q[x] TRUE
1992 Carriere l[x] = P1 l[x](weibull) + P2 l[x](invgompertz) + P3 l[x](gompertz) 6 carriere2 q[x] TRUE
1992 Kostaki q[x]/p[x] = A^[(x+B)^C] + D exp[-(E_i log(x/F_))^2] + G H^x 6 kostaki q[x] FALSE

Because we want to stay general (i.e., model all over the age spectrum), our first attempt will be a Gompertz model.

Gompertz doesn’t fit all age ranges well—particularly young ages. This is a well-known fact. If we were to focus only on adults, however, this would be a nice way to go.

Here, for example, is a Gompertz model fit to 40 year old adults:

The nice thing about this is that it summarizes age-specific mortality in terms of just two parameters. We can then feed these parameters into the gompertz() formula to yield a death rate for a given age. Here is the mortality rate at age 75 for an initial population of 40 year olds:

MortalityLaws::gompertz(x = 75 - 40, coef(gom_fit40))
$hx
[1] 0.030479

$par
        A         B 
0.0014654 0.0867125 

$Sx
[1] 0.71563

For this exercise we want to stay general, so let’s select an alternative model that can better fit the entire age range. To get a sense of the options, let’s select the models that are designed for the entire age range and see their fit:

Figure 1: Fitted mortality rate vs. observed rate, by model

Here we see that some do better than others. We’ll choose the HP2 (Heligman-Bollard) to fit, though note that any well-fitting mortality model can be used because there is a mapping of the fitted parameters to the mortality rate for each (see Table 1). The MortalityLaws package also has a function for each, much like the gompertz() function above. For our selected model, the function is HP2().

Let’s add the fitted mortality model parameters to our overall parameter list object. We’ll include gompertz , too, just for comparison later.

params <- modifyList(params,list(heligman_bollard = coef(hp_fit), 
                                 gompertz = coef(gom_fit)))

Verify that a discrete time Markov model can replicate life-expectancy at age X.

Now we’ll answer the question: can we replicate life expectancy based on a life table using a discrete time Markov model?

To do this, we’ll consruct a very simple two-state discrete time Markov model. We’ll then parameterize the model with the coefficients from the mortality models above, and then calculate life expectancy with a QALY “payoff” of 1.0 if alive and no discounting.

Let’s start with a cohort of 40 year olds. Based on the life table, we should expect these individuals to live an additional 41 years

starting_age = 40

tr_ <-  tr_alt_ <- # Start the Markov trace
  bind_rows(c("alive" = 1, "dead" = 0)) %>% mutate(t = 0) %>% select(t,alive,dead)

for (.x in 1:n_cycles) {
    r_death <- 
      HP2(.x + starting_age - 1, params$heligman_bollard)$hx
    
    ## Embedded Transition Probability Matrix
    m_Q <- # construct the transtiion rate matrix (2x2 for alive-dead)
        matrix(
            c(-(r_death), r_death, 0, 0),
            byrow = TRUE,
            ncol = 2,
            dimnames = list(c("alive", "dead"), c("alive", "dead"))
        )
    m_P <- # embed the transition probability matrix
        expm(m_Q)
    
    p_death <- 1 - exp(-r_death)
    
    ## Alternative version using standard rate-to-probability conversion formulas
    m_P_ <- 
      matrix(c((1 - p_death),p_death,0,1), byrow=TRUE,
             ncol=2,
             dimnames = list(c("alive","dead"),c("alive","dead")))
    
    tmp_ <-  # current state occupancy 
      c(tr_[.x, "alive"], tr_[.x, "dead"]) %>% unlist()
    
    tr_ <- # add next row of the Markov trace
        bind_rows(
            tr_,
            (tmp_ %*% m_P)  %>% 
              as.matrix() %>% 
              data.frame() %>% 
              mutate(t = .x) %>% 
              dplyr::select(t, alive, dead)
        )
    
    tmp_ <-  # current state occupancy 
      c(tr_alt_[.x, "alive"], tr_alt_[.x, "dead"]) %>% unlist()
    
    tr_alt_ <- # add next row of the Markov trace
        bind_rows(
            tr_alt_,
            (tmp_ %*% m_P_)  %>% 
              as.matrix() %>% 
              data.frame() %>% 
              mutate(t = .x) %>% 
              dplyr::select(t, alive, dead)
        )
    
}

# Cycle adjustment
cycle_adj <- 
    c(rep(c(4/3,2/3),n_cycles/2),1/3)
cycle_adj[1] <- 1/3

# Life-Expectancy via the discrete time Markov
payoff_lifeexp <- c(1,0)

life_exp_markov <- 
    t(cycle_adj) %*% tr_$alive 

life_exp_markov_alt <- 
    t(cycle_adj) %*% tr_alt_$alive 

# Life-Expectancy via the life table
life_exp_lifetable <- 
  lt %>% filter(age==starting_age) %>% pull(ex)

tibble(markov = as.vector(life_exp_markov), 
       markov_alt = as.vector(life_exp_markov_alt),life_table = life_exp_lifetable) %>% 
  kable(digits = 2) %>% 
  kable_styling()
markov markov_alt life_table
40.73 40.73 41

We’ll now loop through various starting ages to verify that we can replicate life expectancy with a simple discrete time Markov model.

Figure 2: Comparison of Modeled Life Expectancy from Mortality Models to Life Expectancy from Life Table

Figure 2 Shows that modeled life expectancy based on the Helligman-Bollard approach successfully replicates average life expectancy for nearly any age cohort. The figure also shows that the Gompertz model performs well among middle-aged adult cohorts (e.g., 40) but not for younger or older age cohorts.

2. Incorporating a Disease State (Healthy-Sick-Dead)

Split “Alive” health state into healthy vs. sick

Let’s now expand our simple model to include a “sick” state. For now we’ll assume you’re either healthy or sick, but that being sick does not confer a higher mortality rate:

Verify that a discrete time Markov model can replicate life-expectancy at age X.

Starting with a cohort aged 40, can we successfully replicate life expectancy again?

starting_age = .x = 40

tr_ <- tr_alt_ <- 
  bind_rows(c("healthy" = 1, "sick" = 0, "dead" = 0)) %>% mutate(t = 0) %>% select(t,healthy,sick,dead)

for (.x in 1:n_cycles) {
    
   r_death <- HP2(.x + starting_age - 1, params$heligman_bollard)$hx
    
    # Embedded Version
    m_Q <-
        matrix(
            c(-(r_death + params$r_HS), params$r_HS, r_death,
                0, -(r_death), r_death,
              0,0,0),
            byrow = TRUE,
            ncol = 3,
            dimnames = list(c("healthy", "sick","dead"), c("healthy", "sick","dead"))
        )
    
    m_P <-
        expm(m_Q)
    
    # Using Formulas to Convert
    p_death <- 1 - exp(-r_death)
    p_sick <- 1 - exp(-params$r_HS)
    
    m_P_ <- 
      matrix(c(
       (1-p_death - p_sick), p_sick, p_death,
       0,(1-p_death),p_death,
       0,0,1
      ),
      byrow=TRUE,
      ncol = 3,
            dimnames = list(c("healthy", "sick","dead"), c("healthy", "sick","dead"))
      )
    
    tmp_ <- c(tr_[.x, "healthy"], tr_[.x, "sick"], tr_[.x, "dead"]) %>% unlist()
    
    tr_ <-
        bind_rows(
            tr_,
            (tmp_ %*% m_P)  %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,dead)
        )
    
    tmp_ <- c(tr_alt_[.x, "healthy"], tr_alt_[.x, "sick"], tr_alt_[.x, "dead"]) %>% unlist()
    
    tr_alt_ <-
        bind_rows(
            tr_alt_,
            (tmp_ %*% m_P_)  %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,dead)
        )
    
}

# Cycle adjustment
cycle_adj <- 
    c(rep(c(4/3,2/3),n_cycles/2),1/3)
cycle_adj[1] <- 1/3

# Life-Expectancy via the discrete time Markov
payoff_living <- matrix(c(1,1,0),byrow=TRUE,ncol=1)
living <- as.matrix(tr_[,c("healthy","sick","dead")]) %*% payoff_living
living_alt <- as.matrix(tr_alt_[,c("healthy","sick","dead")]) %*% payoff_living

life_exp_markov <- 
    t(cycle_adj) %*% living 

life_exp_markov_alt <- 
    t(cycle_adj) %*% living_alt

# Life-Expectancy via the life table
life_exp_lifetable <- 
  lt %>% filter(age==starting_age) %>% pull(ex)

tibble(markov = as.vector(life_exp_markov), 
       markov_alt = as.vector(life_exp_markov_alt),life_table = life_exp_lifetable) %>% 
  kable(digits = 2) %>% 
  kable_styling()
markov markov_alt life_table
40.73 40.73 41

3. Cause-Specific Death

Show that if we assume a static hazard ratio for death (applied to the secular death rate) we will dramatically understate overall life expectancy.

What happens if we just put in a cause-specific hazard ratio that acts on the background mortality rate?

starting_age = .x = 40

tr_ <- tr_alt_ <- 
  bind_rows(c("healthy" = 1, "sick" = 0, "dead" = 0)) %>% mutate(t = 0) %>% select(t,healthy,sick,dead)

for (.x in 1:n_cycles) {
    
   r_death <- HP2(.x + starting_age - 1, params$heligman_bollard)$hx
    
      # Correct Version
    m_Q <-
        matrix(
            c(-(r_death + params$r_HS), params$r_HS, r_death,
                0, -(r_death * params$hr_S), r_death* params$hr_S,
              0,0,0),
            byrow = TRUE,
            ncol = 3,
            dimnames = list(c("healthy", "sick","dead"), c("healthy", "sick","dead"))
        )
    
    m_P <-
        expm(m_Q)
    
    # Using Formulas to Convert
    p_death <- 1 - exp(-r_death)
    p_sick <- 1 - exp(-params$r_HS)
    p_sick_death <- 1 - exp(-(r_death * params$hr_S))
    
    m_P_ <- 
      matrix(c(
       (1-p_death - p_sick), p_sick, p_death,
       0,(1-p_sick_death),p_sick_death,
       0,0,1
      ),
      byrow=TRUE,
      ncol = 3,
            dimnames = list(c("healthy", "sick","dead"), c("healthy", "sick","dead"))
      )
    
    tmp_ <- c(tr_[.x, "healthy"], tr_[.x, "sick"], tr_[.x, "dead"]) %>% unlist()
    
    tr_ <-
        bind_rows(
            tr_,
            (tmp_ %*% m_P)  %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,dead)
        )
    
    tmp_ <- c(tr_alt_[.x, "healthy"], tr_alt_[.x, "sick"], tr_alt_[.x, "dead"]) %>% unlist()
    
    tr_alt_ <-
        bind_rows(
            tr_alt_,
            (tmp_ %*% m_P_)  %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,dead)
        )
    
}

# Cycle adjustment
cycle_adj <- 
    c(rep(c(4/3,2/3),n_cycles/2),1/3)
cycle_adj[1] <- 1/3

# Life-Expectancy via the discrete time Markov
payoff_living <- matrix(c(1,1,0),byrow=TRUE,ncol=1)
living <- as.matrix(tr_[,c("healthy","sick","dead")]) %*% payoff_living
living_alt <- as.matrix(tr_alt_[,c("healthy","sick","dead")]) %*% payoff_living

life_exp_markov <- 
    t(cycle_adj) %*% living 

life_exp_markov_alt <- 
    t(cycle_adj) %*% living_alt

# Life-Expectancy via the life table
life_exp_lifetable <- 
  lt %>% filter(age==starting_age) %>% pull(ex)

tibble(markov = as.vector(life_exp_markov), 
       markov_alt = as.vector(life_exp_markov_alt),life_table = life_exp_lifetable) %>% 
  kable(digits = 2) %>% 
  kable_styling()
markov markov_alt life_table
34.88 34.92 41

We no longer approximate the ‘natural history’ life-expectancy! Uh oh.

Construct a cause-deleted life table, and use the cause deleted secular deaths instead.

Let’s get an age-specific cause of death data from the Global Burden of Disease project.

These data summarize, by age group, the percentage of overall deaths that are attributable to cardiovascular disease.

ihme_cvd <- 
tibble::tribble(
        ~age_name,        ~val,
               1L, 0.038771524,
               5L, 0.038546046,
              10L, 0.044403585,
              15L, 0.033781126,
              20L, 0.035856165,
              25L, 0.053077797,
              30L, 0.086001439,
              35L, 0.130326551,
              40L, 0.184310334,
              45L,  0.21839762,
              50L, 0.243705394,
              55L, 0.256334637,
              60L,  0.26828001,
              65L, 0.272698709,
              70L,  0.28529754,
              75L, 0.310642009,
               0L, 0.016750489,
              80L, 0.353518012,
              85L, 0.399856716,
              90L, 0.447817792,
              95L, 0.495305502
        ) %>% 
    mutate(age_ihme = cut(age_name,unique(c(0,1,seq(0,95,5),105)),right=FALSE))  %>% 
    select(age_ihme,  pct_cvd = val) 

We now need to take our life table data and bin it similarly to the IHME data.

# Source: https://grodri.github.io/demography/neoplasms

# The time to death wtihin intervals is drawn from the "a" column here. 
# We really only use it for the 0-1 year old age range. 
# url = "https://grodri.github.io/datasets/preston41.dat"
# b41 <- read.table(url, header=FALSE)
# names(b41) <- c("age","D","Di","lx","a")

edit.na <- function(x, value) { x[is.na(x)] <- value; x}
    
lt_ <-
    lt %>% 
    mutate(age_ihme = cut(age,unique(c(0,1,seq(0,95,5),105)),right=FALSE)) %>% 
    left_join(ihme_cvd,"age_ihme") %>%
    mutate(dx_i = round(dx * pct_cvd)) %>% 
    select(age_ihme,
           age,
           D = dx,  # Deaths
           Di = dx_i, # Cause-specific deaths
           lx = lx) %>% # Living
    mutate(a = ifelse(age_ihme == "[0,1)", 0.152, 0.5)) %>% 

    # The conditional probability of dying of a given cause given survival to 
    # the age group is easy to obtain, we just multiply the overall probability 
    # by the ratio of deaths of a given cause to all deaths.
        
    mutate(q = edit.na(1 - lead(lx)/lx, 1),
           qi = q * Di/D) %>% 
    
    # The unconditional counts of deaths of any cause and of a given cause 
    # are calculated multiplying by the number surviving to the start of each 
    # age group, which is lx. Recall that to die of cause i in the interval 
    # [x, x+n) one must survive all causes up to age x.

    mutate(d = lx * q, 
           di = lx * qi) %>% 
    
    # In preparation for the next part, note that if we had nmx and we were willing 
    # to assume that the hazard is constant in each age group we would have had a 
    # slightly different estimate of the survival function. Let us “back out” 
    # the rates from the probabilities:
        
    mutate(n = c(diff(age),NA), 
           m =  edit.na( q/(n - q * (n - a)), 1/tail(a,1))) %>%  # m[last] = 1/a[last]
    
    # With these rates we compute the cumulative hazard and survival as
    
    mutate(H = cumsum(n * m), 
           S = edit.na(exp(-lag(H)), 1)) %>%  # S[1] = 1
        
    # We compute cause-specific rates by dividing deaths of a given cause into person-years 
    # of exposure, which is equivalent to multiplying the overall rate by the ratio of 
    # deaths of a given cause to the total. Here we want deaths for causes other than 
    # neoplasms. I will use the subscript d for deleted:
        
    mutate(Rd = (D - Di)/D,
           md = m * Rd) %>% 
    
    # We compute the conditional probability of surviving an age group after 
    # deleting a cause as the overall probability raised to Rd, and then calculate 
    # the survival function as a cumulative product
    
    mutate(pd = (1 - q)^Rd,
           ld = 100000 * cumprod(c(1, pd[-length(pd)]))) %>% 
    
    # Then we construct a survival function in the usual way, but treating this 
    # hazard as if it was the only one operating:
        
    mutate(Hd = cumsum(n * md), 
           Sd = edit.na(exp(-lag(Hd)), 1)) %>% # Sd[1] = 1 
    mutate(Pd =  edit.na((Sd - lead(Sd))/md, tail(Sd/md, 1))) %>% 
  
  # Now do it for cause-specific death. 
  
  mutate(Ri = Di / D, 
         pi = (1 - qi)^Ri,
         li = 100000 * cumprod(c(1, pi[-length(pi)]))) %>% 
  mutate(mi = m - md)

bx <- mutate(lt_, agem = age + n/2, mi = m - md)[-nrow(lt_), ]


p_cd <- 
  bx %>% 
  select(agem,m,md,mi) %>% 
  gather(series,value,-agem) %>% 
  mutate(series = factor(series,levels = c("m","md","mi"),labels = c("All Cause", "Non-CVD","CVD"))) %>% 
  ggplot(aes(x = agem, y = value, colour = series)) + geom_line() + scale_y_log10() + 
  ggsci::scale_color_aaas() + 
  geom_dl(method = "smart.grid",aes(label=series)) + 
  scale_x_continuous(expand = c(0.5,0)) + 
  theme(legend.position = "none") + 
  labs(x = "Age", y = "Mortality Rate") + 
  geom_point(data = lt_ %>% mutate(series = "Life Table") %>% filter(age<100), aes(x = age, y = m), alpha =0.2, colour = "darkblue") + 
  geom_point(data = lt_ %>% mutate(series = "Life Table") %>% filter(age<100), aes(x = age, y = md), alpha =0.2, colour = "red") + 
  geom_point(data = lt_ %>% mutate(series = "Life Table") %>% filter(age<100), aes(x = age, y = m - md), alpha =0.2, colour = "darkgreen") 
  
p_cd

Figure 3: Mortality Rates

ages_     <- lt_$age[lt_$age<=max_age & lt_$age>=min_age]
deaths_   <- lt_$d[lt_$age<=max_age & lt_$age>=min_age] - lt_$di[lt_$age<=max_age & lt_$age>=min_age]
exposure_  <- lt_$lx[lt_$age<=max_age & lt_$age>=min_age]

hp_fit_ <- MortalityLaw(
                x  = ages_,
                Dx  = deaths_,   # vector with death counts
                Ex  = exposure_, # vector containing exposures
                law = "HP2",
                opt.method = "LF2")

Model using cause-specific death rate from the cause-deleted life table data.

Now let’s construct a model that adds in cause-specific death, and uses cause-deleted (modeled) mortality for secular death:

params <- modifyList(params, list(cause_deleted = coef(hp_fit_)))
starting_age = .x = 40

tr_ <- tr_alt_ <- 
  bind_rows(c("healthy" = 1, "sick" = 0, "cvddeath" = 0, "dead" = 0)) %>% mutate(t = 0) %>% select(t,healthy,sick,cvddeath,dead)

for (.x in 1:n_cycles) {
  
   current_age <- min(.x + starting_age - 1,max(lt_$age))
  
   r_death <- HP2(.x + starting_age - 1, params$cause_deleted)$hx
   r_cause <- lt_ %>% filter(age==current_age) %>% pull(mi)
    
    # Correct Version
    m_Q <-
        matrix(
            c(-(r_death + params$r_HS), params$r_HS, 0, r_death,
                0, -(r_cause+r_death), r_cause,r_death,
              0,0,0,0,
              0,0,0,0),
            byrow = TRUE,
            ncol = 4,
            dimnames = list( c("healthy", "sick","cvddeath","dead"), c("healthy", "sick","cvddeath","dead"))
        )
    
    m_P <-
        expm(m_Q)
    
    # Using Formulas to Convert
    p_death <- 1 - exp(-r_death)
    p_sick <- 1 - exp(-params$r_HS)
    p_sick_death <- 1 - exp(-(r_cause))
    
    m_P_ <- 
      matrix(c(
       (1-p_death - p_sick), p_sick, 0,p_death,
       0,(1-p_sick_death - p_death),p_sick_death,p_death,
       0,0,0,0,
       0,0,0,1
      ),
      byrow=TRUE,
      ncol = 4,
            dimnames = list(c("healthy", "sick","cvddeath","dead"), c("healthy", "sick","cvddeath","dead"))
      )
    
    tmp_ <- c(tr_[.x, "healthy"], tr_[.x, "sick"], tr_[.x, "cvddeath"],tr_[.x, "dead"]) %>% unlist()
    
    tr_ <-
        bind_rows(
            tr_,
            (tmp_ %*% m_P)  %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,cvddeath,dead)
        )
    
    tmp_ <- c(tr_alt_[.x, "healthy"], tr_alt_[.x, "sick"], tr_alt_[.x, "cvddeath"], tr_alt_[.x, "dead"]) %>% unlist()
    
    tr_alt_ <-
        bind_rows(
            tr_alt_,
            (tmp_ %*% m_P_)  %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,cvddeath,dead)
        )
    
}

# Cycle adjustment
cycle_adj <- 
    c(rep(c(4/3,2/3),n_cycles/2),1/3)
cycle_adj[1] <- 1/3

# Life-Expectancy via the discrete time Markov
payoff_living <- matrix(c(1,1,0,0),byrow=TRUE,ncol=1)
living <- as.matrix(tr_[,c("healthy","sick","cvddeath","dead")]) %*% payoff_living
living_alt <- as.matrix(tr_alt_[,c("healthy","sick","cvddeath","dead")]) %*% payoff_living

life_exp_markov <- 
    t(cycle_adj) %*% living 

life_exp_markov_alt <- 
    t(cycle_adj) %*% living_alt

# Life-Expectancy via the life table
life_exp_lifetable <- 
  lt %>% filter(age==starting_age) %>% pull(ex)

tibble(markov = as.vector(life_exp_markov), 
       markov_alt = as.vector(life_exp_markov_alt),life_table = life_exp_lifetable) %>% 
  kable(digits = 2) %>% 
  kable_styling()
markov markov_alt life_table
40.97 40.88 41
lt_ %>% 
  filter(age>=40) %>% 
  mutate(qd = 1 - exp(-md)) %>% 
  select(-pd) %>% 
  mutate(pd = 1 - qd)   %>% 
  mutate(cump = ifelse(row_number()==1,1,lag(cumprod(pd),1)) ) %>% 
  select(-lx) %>% 
  mutate(lx = N  * cump)  %>% 
  mutate(ditest = lx * qd) %>% 
  filter(age<=85) %>% 
  ggplot(aes(x = age, y = cump)) + geom_step() + 
  geom_line(data = tr_ %>% filter(t<=45), aes(x = starting_age + t, y = 1 - dead ), colour = "red") + labs(x = "Age", y = "Survival")

Survival by Age: Discrete Time Markov (Red) vs. Cause-Deleted Life Table (Black)

Good news: we essentially replicate overall life expectancy! Though note the small difference with the model based on formula conversions rather than embedding.

  • Note that we used the cause-deleted life table’s cause-specific mortality rate (mi) here. Another alternative would be to use a similar mortality model as above – but in practice, for CVD deaths this does not yield good predictions for old ages (see plot )
ages_     <- lt_$age[lt_$age<=max_age & lt_$age>=min_age]
deaths_i   <- lt_$di[lt_$age<=max_age & lt_$age>=min_age]
exposure_i  <- lt_$li[lt_$age<=max_age & lt_$age>=min_age]

hp_fit_i <- MortalityLaw(
                x  = ages_,
                Dx  = deaths_i,   # vector with death counts
                Ex  = exposure_i, # vector containing exposures
                law = "HP2",
                opt.method = "LF2")

params <- modifyList(params, list(cause_deleted = coef(hp_fit_),
                                  cause_specific = coef(hp_fit_i)))

plot(hp_fit_i)

4. Ensuring Accurate Counts, Costs and QALYs

Count the number who become ill under various approaches.

Can also try Fernando’s approach

starting_age = .x = 40
starting_pop = N

tr_ <-
  bind_rows(c("healthy" = N, "sick" = 0, "cvddeath" = 0, "dead" = 0, "accsick" = 0)) %>% mutate(t = 0) %>% select(t,healthy,sick,cvddeath,dead,accsick)

tr_alt_ <- 
  bind_rows(c("healthy" = N, "sick" = 0, "cvddeath" = 0, "dead" = 0)) %>% mutate(t = 0) %>% select(t,healthy,sick,cvddeath,dead)

tr_nodeath <- 
  bind_rows(c("healthy" = N, "sick" = 0)) %>% mutate(t = 0) %>% select(t,healthy,sick)

# transition array
arr_ <- arr_alt_ <- arr_nodeath <- list() 

for (.x in 1:n_cycles) {
  
   current_age <- min(.x + starting_age - 1,max(lt_$age))
  
   r_death <- HP2(.x + starting_age - 1, params$cause_deleted)$hx
   r_cause <- lt_ %>% filter(age==current_age) %>% pull(mi)
    
   m_Q_nodeath <- 
     matrix(
       c(-params$r_HS,params$r_HS,
         0,0),
       byrow=TRUE,
       ncol = 2, 
       dimnames = list(c("healthy","sick"),c("healthy","sick"))
     )
   
   m_P_nodeath <- 
     expm(m_Q_nodeath)
   
  tmp_nodeath <- c(tr_nodeath[.x, "healthy"], tr_nodeath[.x, "sick"]) %>% unlist()
    
  tr_nodeath <-
        bind_rows(
            tr_nodeath,
            (tmp_nodeath %*% m_P_nodeath)  %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick)
        )
   
    # Correct Version
    m_Q <-
        matrix(
            c(-(r_death + params$r_HS), params$r_HS, 0, r_death,
                0, -(r_cause+r_death), r_cause,r_death,
              0,0,0,0,
              0,0,0,0),
            byrow = TRUE,
            ncol = 4,
            dimnames = list( c("healthy", "sick","cvddeath","dead"), c("healthy", "sick","cvddeath","dead"))
        )
    
    # Add the accumulator 
    m_Q_acc <- 
      cbind(rbind(m_Q,rep(0,nrow(m_Q))),rep(0,ncol(m_Q)+1))
    rownames(m_Q_acc) = c(rownames(m_Q),"accsick")
    colnames(m_Q_acc) = c(colnames(m_Q),"accsick")
    
    m_Q_acc["healthy","accsick"] <- m_Q_acc["healthy","sick"]

    m_P <-
        expm(m_Q_acc)
    
    # Using Formulas to Convert
    p_death <- 1 - exp(-r_death)
    p_sick <- 1 - exp(-params$r_HS)
    p_sick_death <- 1 - exp(-(r_cause))
    
    m_P_ <- 
      matrix(c(
       (1-p_death - p_sick), p_sick, 0,p_death,
       0,(1-p_sick_death - p_death),p_sick_death,p_death,
       0,0,0,0,
       0,0,0,1
      ),
      byrow=TRUE,
      ncol = 4,
            dimnames = list(c("healthy", "sick","cvddeath","dead"), c("healthy", "sick","cvddeath","dead"))
      )

    tmp_ <- c(tr_[.x, "healthy"], tr_[.x, "sick"], tr_[.x, "cvddeath"],tr_[.x, "dead"],tr_[.x,"accsick"]) %>% unlist()
    
    tr_ <-
        bind_rows(
            tr_,
            (tmp_ %*% m_P)  %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,cvddeath,dead,accsick)
        )
    
    arr_tmp <- diag(tmp_)
    arr_[[.x]] <- 
      arr_tmp %*% m_P
    
    tmp_alt_ <- c(tr_alt_[.x, "healthy"], tr_alt_[.x, "sick"], tr_alt_[.x, "cvddeath"], tr_alt_[.x, "dead"]) %>% unlist()
    
    tr_alt_ <-
        bind_rows(
            tr_alt_,
            (tmp_alt_ %*% m_P_)  %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,cvddeath,dead)
        )
    
    arr_alt_tmp <- diag(tmp_alt_)
    arr_alt_[[.x]] <- 
      arr_alt_tmp %*% m_P_
    
    
}

# Cycle adjustment
cycle_adj <- 
    c(rep(c(4/3,2/3),n_cycles/2),1/3)
cycle_adj[1] <- 1/3

# Embedded
payoff_sick <- matrix(c(0,0,0,0,1))
sick <- as.matrix(tr_[,c("healthy","sick","cvddeath","dead","accsick")]) %*% payoff_sick
sick_embedded <- t(cycle_adj) %*% sick

# Formulas
payoff_sick_alt <- matrix(c(0,1,0,0))
sick_alt <- as.matrix(tr_[,c("healthy","sick","cvddeath","dead")]) %*% payoff_sick_alt
sick_formula <- t(cycle_adj) %*% sick_alt

# No Death
payoff_sick_nodeath <- matrix(c(0,1))
sick_nodeath <- as.matrix(tr_nodeath[,c("healthy","sick")]) %*% payoff_sick_nodeath
sick_nodeath <- t(cycle_adj) %*% sick_nodeath

sick_nodeath
        [,1]
[1,] 8965048
sick_formula
        [,1]
[1,] 3166013
sick_embedded
        [,1]
[1,] 8706612

Why the divergence? At later years the hazard of death is very high – but the hazard of dying from CVD is also very high!

Let’s take a look at what happens in the first cycle.

Here is the Markov trace for the embedded matrix:

tr_[2,]
# A tibble: 1 × 6
      t healthy  sick cvddeath  dead accsick
  <dbl>   <dbl> <dbl>    <dbl> <dbl>   <dbl>
1     1  90336. 9499.     1.86  163.   9509.

And here it is for the matrix constructed using conversion formulas:

tr_alt_[2,]
# A tibble: 1 × 5
      t healthy  sick cvddeath  dead
  <dbl>   <dbl> <dbl>    <dbl> <dbl>
1     1  90321. 9516.        0  163.

We see that in both, 163 people die. But there are more people who remain sick in the trace based on the formulas (9516) than in the trace based on the embedded matrix (9499). Why?

9516 is the total who would become sick if we don’t allow any death in the interval for those who would become sick in the interval.

That is, let’s think about the population who would become sick in an interval. There are four mutually exclusive groups:

  1. Individuals who become sick in the interval and stay sick throughout the interval.
  2. Individuals who would have become sick in the interval, but died first.
  3. Individuals who became sick in the interval, but also died from the illness within the interval.
  4. Individuals who became sick in the interval, but also died from background causes within the interval.

The total number 9516 sick at the end of the interval based on a matrix converted using standard formulas reflects the union of all the above. We’re essentially “trapping” these patients within the interval and not allowing them to die for any other reason–either before or after they become ill.

Adding Non-Markovian Elements to the Transition Probability Matrix

Accumulators

Tunnel states

We will create a T1 state for tracking tunnels and allow exit from due to external death. This requires us to have a recieving bucket for these external risks ‘N’ for NULL so as not to disturb the balance of the Markov.

m = number of primary health states

  1. Create m x m rate matrix R with initial cell values of 0
  2. Fill in off-diagonal elements per parameters
  3. Diagonal of R is negative row sum of off-diagonal elements
  4. Add accumulator columns and rows as 0s. Call this new matrix R_
  5. Fill in appropriate accumulator transition rates in R_
  6. Add in tunnel states as new columns and rows
  7. Add an additional “N” (NULL) state as recieving bucket external risks (e.g., background mortality) that can occur while in the tunnel so as not to disturb the balance of the Markov.
  8. Take matrix exponential of R_ to obtain M_, the initial embedded transition probability matrix
  9. Make edits to R_ to reflect tunnel state activity.
  • Since it is possible to exit the tunnel for background mortality reasons, add the total background mortality
starting_age = 40

markov_rate_matrices<- function(t)
{
  lapply(t, function(tt){
    
   current_age <- min(tt + starting_age - 1,max(lt_$age))
  
   r_death <- HP2(current_age, params$cause_deleted)$hx
   r_cause <- lt_ %>% filter(age==current_age) %>% pull(mi) 
  
    x <- matrix(c(0.0, params$r_HS, 0.0, r_death,
                  0.0, 0.0, r_cause, r_death,
                  0.0, 0.0, 0.0, 0.0,
                  0.0, 0.0, 0.0, 0.0),
           nrow=4,
           ncol=4,
           byrow=TRUE,
           dimnames=list(c("healthy", "sick","cvddeath","dead"),
                         c("healthy", "sick","cvddeath","dead")))
  
    diag(x) <- -rowSums(x)
    x
  })
}

non_markov_rate_matrices <- function(t)
{
  markov     <- markov_rate_matrices(t)
  non_markov <- list()
  for(i in t)
  {
    # Expand matrix
    non_markov[[i]] <- cbind(markov[[i]],     matrix(rep(0, 4*3), nrow=4))
    non_markov[[i]] <- rbind(non_markov[[i]], matrix(rep(0, 3*7), nrow=3))
  }
  
  
  lapply(non_markov, function(m){
    # Put in State Names
    rownames(m) <- c("healthy", "sick", "cvddeath", "dead", "accsick", "sickT1", "N")
    colnames(m) <- c("healthy", "sick", "cvddeath", "dead", "accsick", "sickT1", "N")
    
    # Define Accumulator
    m["healthy", "accsick"] <- m["healthy", "sick"] 
    
    # Define Tunnel state entry
    m["healthy", "sickT1"]  <- m["healthy", "sick"]
    
    m  # Note: Tunnel states are not fully defined at this point.
  })

}

Next we embed the non-Markovian matrices into the time step and finish the definition of the tunnel states in the probability space.

transition_prob <- function(t,h=1)
{
  # Embed the matrices into the timesteps
  tp <- lapply(non_markov_rate_matrices(t), function(m) expm(m*h))
  
  lapply(tp, function(m)
  {
    # It is possible to exit tunnel to external risk of death
    m["sickT1", "N"]  <- m["healthy", "cvddeath"] + m["healthy", "dead"]
    m["sickT1", "sickT1"] <- 0  # Cannot remain in tunnel
    
    # last state in the tunnel is a terminal state
    m["sickT1", "N"]  <- 1

    # Note: At this point, the "N" state could be stripped as it was
    #       only required for the embedding, and serves no other purpose
    #       at this point
    
    states <- c("healthy", "sick", "cvddeath", "dead", "accsick", "sickT1")
    m[states, states]
  })
}

5. Backwards Conversion

m_P <- matrix(c(0.721, 0.202, 0.067, 0.010,
                               0.000, 0.581, 0.407, 0.012,
                               0.000, 0.000, 0.750, 0.250,
                               0.000, 0.000, 0.000, 1.000), 
                               nrow=4, byrow=TRUE,
                               dimnames=list(c("A", "B", "C", "D"),
                                             c("A", "B", "C", "D")))
m_P  
      A     B     C     D
A 0.721 0.202 0.067 0.010
B 0.000 0.581 0.407 0.012
C 0.000 0.000 0.750 0.250
D 0.000 0.000 0.000 1.000

The bottom diagonal was corrected to be a Markov absorbing state by having 1 on the diagonal. For the purposes of example, we’ll assume we wish to add an additional state “E”, which has a continuous rate of 0.2 that competes with other transitions.

Take the HIV model and use eigenvalue decomposition to obtain the continuous generator matrix.

To accomplish this we must find the continuous generator.

V  <- eigen(m_P)$vectors
iV <- solve(V)
Ap <- iV %*% m_P %*% V

Ap
                         [,1]                     [,2]
[1,]  1.000000000000000000000 -0.000000000000000011388
[2,]  0.000000000000000172422  0.750000000000000111022
[3,] -0.000000000000001054209 -0.000000000000000197747
[4,] -0.000000000000000062574 -0.000000000000000018793
                          [,3]                     [,4]
[1,]  0.0000000000000000000000  0.000000000000000016253
[2,]  0.0000000000000000000000 -0.000000000000000156542
[3,]  0.7209999999999999742428  0.000000000000000036544
[4,] -0.0000000000000000059637  0.580999999999999960920

Due to the numeric probabilities not being exactly correct, the off-diagonal elements of A' are not zero, but they are quite close. We will zero these off diagonal elements and assume that the non-zero elements are numerical error. Then continue by taking the log of the diagonal.

lAp <- diag(log(diag(Ap)), nrow(Ap), ncol(Ap))

R  <- V %*% lAp %*% iV

R 
                          [,1]                      [,2]       [,3]      [,4]
[1,] -0.3271161416971880009363  0.3114960917676688478828  0.0024395  0.013181
[2,]  0.0000000000000000025585 -0.5430045221302257640872  0.6148890 -0.071884
[3,]  0.0000000000000000000000  0.0000000000000000070641 -0.2876821  0.287682
[4,]  0.0000000000000000000000  0.0000000000000000000000  0.0000000  0.000000

An now we have the continuous time rate generator for the Markov Model. There is still some numerical error—for example the bottom row, has values very near to zero, and the diagonal is not exactly the negative sum of the rest of the row. We can clean this up by tweaking the numbers numerical towards their constraints.

R[abs(R) < 1e-6 ] <- 0


rownames(R) <- c("A", "B", "C", "D")
colnames(R) <- c("A", "B", "C", "D")

round(R, 3)
       A      B      C      D
A -0.327  0.311  0.002  0.013
B  0.000 -0.543  0.615 -0.072
C  0.000  0.000 -0.288  0.288
D  0.000  0.000  0.000  0.000

Some numerical error is inevitable in this process and cannot be avoided. However, even cleaning up the small and obvious errors, the transition from B -> D is impossible since it’s negative. Rates are relative to the occupancy of the source state, in this case B. Having a negative rate implies that B would have some transitions in which the dead come alive into B based upon the occupancy of B. Obviously, this is not possible. The problem now is how to adjust the model’s rates to be physically possible, while as faithful to the original data as possible. B is a sicker state, so it should have a higher death rate.

Demonstrate that the embedded (new) transition probability matrix successfully recreates the original transition probability matrix and results.

Let’s double check it recapitulates the original rates.

expm(R)
4 x 4 Matrix of class "dgeMatrix"
      A     B     C     D
A 0.721 0.202 0.067 0.010
B 0.000 0.581 0.407 0.012
C 0.000 0.000 0.750 0.250
D 0.000 0.000 0.000 1.000

Now let’s replicate the monotherapy strategy results from the original.

params_hiv <- 
  list(
    R = R,
    dmca =   1701 , # Direct medical costs associated with state A
    dmcb =   1774 , # Direct medical costs associated with state B
    dmcc =   6948 , # Direct medical costs associated with state C
    ccca =   1055 , # Community care costs associated with state A
    cccb =   1278 , # Community care costs associated with state B
    cccc =   2059 , # Community care costs associated with state C
    cAZT =   2278 , # Zidovudine drug cost
    cLam =   2087 , # Lamivudine drug cost
    RR    =  0.509, # Treatment effect (RR)
    cDR  =  0.06,   # Annual discount rate - costs (%)
    oDR  =  0       # Annual discount rate - benefits (%)
  )

starting_age = 40

markov_rate_matrices <- function(t)
{
  lapply(t, function(tt) {
    current_age <- min(tt + starting_age - 1, max(lt_$age))
    
    # r_death <- HP2(current_age, params$cause_deleted)$hx
    # r_cause <- lt_ %>% filter(age==current_age) %>% pull(mi)
    
    x <- params_hiv$R

    diag(x) <- 0
    diag(x) <- -rowSums(x)
    x
  })
}

transition_prob <- function(t,h=1)
{
  # Embed the matrices into the timesteps
  tp <- lapply(markov_rate_matrices(t), function(m) expm(m*h))
  
}

Y <- t(c(A = 1, B = 0, C = 0, D = 0))

res_mono <- do.call(rbind, lapply(transition_prob(1:20), function(tp) {
  Y <<- Y %*% tp
}))

res_mono <- rbind(
  c(A = 1, B = 0, C = 0, D = 0), 
  res_mono
)

res_mono <- round(res_mono,3)

payoff_health  = matrix(c(1,1,1,0),byrow=TRUE,ncol=1,dimnames=list(c("A","B","C","D"),c("qaly")))
qaly_mono <- as.matrix(res_mono[-1,]) %*% payoff_health
qaly_disc_mono <- qaly_mono / (1 + params_hiv$oDR)^c(1:(nrow(res_mono)-1))

payoff_cost_mono <- with(params_hiv,matrix(c(dmca + ccca + cAZT,
                        dmcb + cccb + cAZT,
                        dmcc + cccc + cAZT,
                        0)))
cost_mono <- as.matrix(res_mono[-1,]) %*% payoff_cost_mono %>% round(.,0)
cost_disc_mono <- cost_mono / (1 + params_hiv$cDR)^c(1:(nrow(res_mono)-1))
  

sum(qaly_disc_mono)
[1] 7.974
sum(round(cost_disc_mono,0))
[1] 44589

This compares favorably to the textbook answer, which is 7.99 for QALYs and £44,663 for costs. The main differences seem to be rounding in the Excel answer within the textbook.

6. Changing Countries or Contexts

Rather than have a static death rate, let’s swap in secular mortality from several different countries and regions.

tibble::tribble(
  ~code, ~country,
        "AUS",                              "Australia",
        "AUT",                               "Austria",
        "BLR",                               "Belarus",
        "BEL",                               "Belgium",
        "BGR",                              "Bulgaria",
        "CAN",                                "Canada",
        "CHL",                                 "Chile",
        "CZE",                        "Czech Republic",
        "DNK",                               "Denmark",
        "EST",                               "Estonia",
        "FIN",                               "Finland",
     "FRATNP",             "France (total population)",
     "FRACNP",          "France (civilian population)",
     "DEUTNP",            "Germany (total population)",
      "DEUTE",                        "Germany (east)",
      "DEUTW",                        "Germany (west)",
        "GRC",                                "Greece",
        "HUN",                               "Hungary",
        "ISL",                               "Iceland",
        "IRL",                               "Ireland",
        "ISR",                                "Israel",
        "ITA",                                 "Italy",
        "JPN",                                 "Japan",
        "LVA",                                "Latvia",
        "LTU",                             "Lithuania",
        "LUX",                            "Luxembourg",
        "NLD",                           "Netherlands",
     "NZL_NP",        "New Zealand (total population)",
     "NZL_MA",        "New Zealand (Maori population)",
     "NZL_NM",    "New Zealand (non-Maori population)",
        "NOR",                                "Norway",
        "POL",                                "Poland",
        "PRT",                              "Portugal",
        "RUS",                                "Russia",
        "SVK",                              "Slovakia",
        "SVN",                              "Slovenia",
        "ESP",                                 "Spain",
        "SWE",                                "Sweden",
        "CHE",                           "Switzerland",
        "TWN",                                "Taiwan",
     "GBR_NP",                        "United Kingdom",
    "GBRTENW",    "England & Wales (total population)",
    "GBRCENW", "England & Wales (civilian population)",
    "GBR_SCO",                              "Scotland",
    "GBR_NIR",                      "Northern Ireland",
        "USA",                                "U.S.A.",
        "UKR",                               "Ukraine"
    ) %>% 
kable() %>% 
  kable_styling()
code country
AUS Australia
AUT Austria
BLR Belarus
BEL Belgium
BGR Bulgaria
CAN Canada
CHL Chile
CZE Czech Republic
DNK Denmark
EST Estonia
FIN Finland
FRATNP France (total population)
FRACNP France (civilian population)
DEUTNP Germany (total population)
DEUTE Germany (east)
DEUTW Germany (west)
GRC Greece
HUN Hungary
ISL Iceland
IRL Ireland
ISR Israel
ITA Italy
JPN Japan
LVA Latvia
LTU Lithuania
LUX Luxembourg
NLD Netherlands
NZL_NP New Zealand (total population)
NZL_MA New Zealand (Maori population)
NZL_NM New Zealand (non-Maori population)
NOR Norway
POL Poland
PRT Portugal
RUS Russia
SVK Slovakia
SVN Slovenia
ESP Spain
SWE Sweden
CHE Switzerland
TWN Taiwan
GBR_NP United Kingdom
GBRTENW England & Wales (total population)
GBRCENW England & Wales (civilian population)
GBR_SCO Scotland
GBR_NIR Northern Ireland
USA U.S.A.
UKR Ukraine
#hmd.ger <- demography::hmd.mx("DEUTNP",username = "", password = "", "")
#write_rds(hmd.ger,file=here("_sandbox/mortality/germany-life-table.rds"))

#hmd.pol <- demography::hmd.mx("POL",username = "", password = "", "POL")
#write_rds(hmd.pol,file=here("_sandbox/mortality/poland-life-table.rds"))

Fit a mortality model to German life-table data and use the mortality rate from this model to estimate QALYs and costs

# Fit the mortality model

min_age = 40
max_age = 99

hmd_ger <- read_rds(here("_sandbox/mortality/germany-life-table.rds"))

lt_ger <-  # Read in the U.S. life table data from the human mortality database.
  hmd_ger %>% 
  lifetable(.,series = "total", years = mort_year) %>% 
  as_tibble() %>% 
  mutate_at(vars(lx,dx), function(x) x*N) %>% 
  mutate(country = "USA") %>% 
  mutate(age = x)

ages     <- lt_ger$x[lt_ger$x<=max_age & lt_ger$x>=min_age]
deaths   <- lt_ger$dx[lt_ger$x<=max_age & lt_ger$x>=min_age]
exposure <- lt_ger$lx[lt_ger$x<=max_age & lt_ger$x>=min_age]

hp_fit_ger <- MortalityLaw(
                x  = ages,
                Dx  = deaths,   # vector with death counts
                Ex  = exposure, # vector containing exposures
                law = "HP2",
                opt.method = "LF2")

params_hiv <- modifyList(params_hiv, list(mort = coef(hp_fit_ger),
                                          hr_mort = 2))
markov_rate_matrices <- function(t)
{
  lapply(t, function(tt) {
    current_age <- min(tt + starting_age - 1, max(lt_$age))
    
    r_death <- params_hiv$hr_mort * HP2(current_age, params_hiv$mort)$hx

    x <- params_hiv$R
    x["A","D"] <- r_death

    diag(x) <- 0
    diag(x) <- -rowSums(x)
    x
  })
}

transition_prob <- function(t,h=1)
{
  # Embed the matrices into the timesteps
  tp <- lapply(markov_rate_matrices(t), function(m) expm(m*h))
  tp
}

Y <- t(c(A = 1, B = 0, C = 0, D = 0))

res_mono <- do.call(rbind, lapply(transition_prob(1:20), function(tp) {
  Y <<- Y %*% tp
}))

res_mono <- rbind(
  c(A = 1, B = 0, C = 0, D = 0), 
  res_mono
)

res_mono <- round(res_mono,3)

payoff_health  = matrix(c(1,1,1,0),byrow=TRUE,ncol=1,dimnames=list(c("A","B","C","D"),c("qaly")))
qaly_mono <- as.matrix(res_mono[-1,]) %*% payoff_health
qaly_disc_mono <- qaly_mono / (1 + params_hiv$oDR)^c(1:(nrow(res_mono)-1))

payoff_cost_mono <- with(params_hiv,matrix(c(dmca + ccca + cAZT,
                        dmcb + cccb + cAZT,
                        dmcc + cccc + cAZT,
                        0)))
cost_mono <- as.matrix(res_mono[-1,]) %*% payoff_cost_mono %>% round(.,0)
cost_disc_mono <- cost_mono / (1 + params_hiv$cDR)^c(1:(nrow(res_mono)-1))
  
sum(qaly_disc_mono)
[1] 8.251
sum(round(cost_disc_mono,0))
[1] 45928

7. Inpororating new evidence

Suppose a new combination therapy treatment has been approved. What is the cost effectiveness relative to monotherapy?

8. Solving for PSA parameters

The new combination therapy arm has some new parameters with 95% CIs.

Solve for the PSA distribution parameters

How Do We Draw PSA Values?

Parameter Type Distribution
Probability beta
Rate gamma
Utility weight beta
Right skew (e.g., cost) gamma, lognormal
Relative risks or hazard ratios lognormal
Odds Ratio logistic

Normal Distribution

\sigma = \frac{x_2 - x_1}{\Phi^{-1}(p_2)-\Phi^{-1}(p_1)} \tag{1}

\mu = \frac{x_1\Phi^{-1}(p_2)-x_2\Phi^{-1}(p_1)}{\Phi^{-1}(p_2)-\Phi^{-1}(p_1)}

x1 <- qnorm(0.1,1.3,.2)
p1 <- 0.1

x2 <- qnorm(0.9,1.3,.2)
p2 <- 0.9

param_normal <- function(x1,x2,p1,p2) {
    sigma = (x2 - x1) / (qnorm(p2,0,1)-qnorm(p1,0,1)); sigma
    mu <- (x1*qnorm(p2,0,1)-x2*qnorm(p1,0,1))/(qnorm(p2,0,1)-qnorm(p1,0,1)); mu
    
    c("mu" = mu, "sigma" = sigma)
}

Gamma Distribution

x1 <- 0.6
x2 <- 0.8
p1 <- 0.1
p2 <- 0.9

gamma_fn <- function(alpha) {
    x1*qgamma(p2,shape = alpha, scale =1) - x2 * qgamma(p1, shape = alpha, scale = 1)
}

calc_beta <- function(x1,p1,alpha) {
    x1 / qgamma(p1,alpha,1)
}

curve(gamma_fn, xlim = c(1,100), col = "blue", lwd = 1.5, lty=2)
abline(a=0,b=0)

alpha_ <- uniroot(gamma_fn,c(70,85))$root; alpha_
[1] 79.748
beta_ <- calc_beta(x1 = x1,  p1 = p1, alpha = alpha_); beta_
[1] 0.0087542
# Check the answer
qgamma(0.1,shape = alpha_, scale = beta_)
[1] 0.6
qgamma(0.9,shape = alpha_, scale = beta_)
[1] 0.8
param_gamma <- function(x1,x2,p1,p2,range) {
    alpha_ <- uniroot(gamma_fn,range)$root; alpha_
    beta_ <- calc_beta(x1 = x1,  p1 = p1, alpha = alpha_); beta_
    
    c("alpha" = alpha_, "beta" = beta_)
}
param_gamma(x1= x1, x2 = x2, p1 = p1, p2 = p2, range = c(60,100))
     alpha       beta 
79.7475577  0.0087542 

Beta Distribution

# Source: https://stats.stackexchange.com/questions/112614/determining-beta-distribution-parameters-alpha-and-beta-from-two-arbitrary

x1 <- 0.6
x2 <- 0.8
p1 <- 0.1
p2 <- 0.9

# Logistic transformation of the Beta CDF.
f.beta <- function(alpha, beta, x, lower=0, upper=1) {
  p <- pbeta((x-lower)/(upper-lower), alpha, beta)
  log(p/(1-p))
}

# Sums of squares.
delta <- function(fit, actual) sum((fit-actual)^2)

# The objective function handles the transformed parameters `theta` and
# uses `f.beta` and `delta` to fit the values and measure their discrepancies.
objective <- function(theta, x, prob, ...) {
  ab <- exp(theta) # Parameters are the *logs* of alpha and beta
  fit <- f.beta(ab[1], ab[2], x, ...)
  return (delta(fit, prob))
}

x.p <- (function(p) log(p/(1-p)))(c(p1, p2))
start <- log(c(1e1, 1e1))
sol <- nlm(objective, start, x=c(x1,x2), prob=x.p, lower=0, upper=1, typsize=c(1,1), fscale=1e-12, gradtol=1e-12)
params <- exp(sol$estimate); params
[1] 23.707 10.031
qbeta(p = c(p1, p2), params[1], params[2])
[1] 0.6 0.8

9. Simulating PSA values from copulas

######################################
## Option 1: Cholesky Decomposition
######################################
# Source: Coping with Copulas Sec 5.1

# 1. Define the correlation matrix
  rho <- 0.6
  sigma <- matrix(c(1,rho,rho,1),nrow = 2, byrow=TRUE)

# 2. Perform a cholesky decomposition
  A = chol(sigma)

# 3. Generate iid standard normal pseudo random variables
  tildeY0 <- rnorm(n = 1e6, mean = 0, sd = 1)
  tildeY1 <- rnorm(n = 1e6, mean = 0, sd = 1)
  tildeY <- cbind(tildeY0,tildeY1)

  # Collect them 
  tildeY <- tildeY %*% A

# Use the standard normal disribution function to return the quantiles
  U <- pnorm(tildeY)

# Use the inverse distribution function to return the values
  Y <- U 
  Y[,1] <- qpois(U[,1],lambda = 10)
  Y[,2] <- qpois(U[,2],lambda = 12)

# Confirm the correct correlation 
cor(Y[,1],Y[,2])
[1] 0.59356
##########################
## Option 2: Use MVRNORM
##########################
U_ <- pnorm(mvrnorm(n = 1e6, Sigma = sigma, mu = c(0,0)))
Y_ <- U_
Y_[,1] <- qpois(U_[,1],lambda = 10)
Y_[,2] <- qpois(U_[,2],lambda = 12)

cor(Y_[,1],Y_[,2])
[1] 0.5921

Simulate direct medical care and community care costs for each state using copulas.

rho <- 0.6
sigma <- matrix(c(1,rho,rho,1),nrow = 2, byrow=TRUE)

U_ <- pnorm(mvrnorm(n = 1e3, Sigma = sigma, mu = c(0,0)))
Y_ <- U_
Y_[,1] <- qpois(U_[,1],lambda = 10)
Y_[,2] <- qpois(U_[,2],lambda = 12)

cor(Y_[,1],Y_[,2])
[1] 0.60362